Abstract

This study was conducted with 68 Iranian students studying at the Alborz Institute of Higher Education. 
The participants' majors were Teaching English as a Foreign Language (TEFL; n = 23), English Language 
Literature (ELL; n = 22), and English Language Translation (ELT; n = 23). DIALANG self-assessment scales, 
consisting of 107 statements with Yes/No responses, were used in this study. DIALANG is an online 
assessment system used for language learners who want to obtain diagnostic information about their 
language proficiency (Council of Europe, 2001). Results indicated that ELL students had the highest 
overall ranking, whereas TEFL students received the lowest overall ranking for listening skills. ELL 
students had the highest reading skill scores while ELT students demonstrated the lowest scores. ELL 
students ranked highest in writing ability whereas TEFL students rated lowest in writing skill. Kruskal- 
Wallis analysis revealed that there was no statistically significant difference in listening and reading skills 
across the three majors. One-way between-groups ANOVA did demonstrate a statistically significant 
difference in the writing self-assessment statements for the three groups. Implications and directions for 
future research with DIALANG are provided based on results from the study. 

Introduction

The introduction of alternative classroom assessment strategies in the early 1990s opened up new 
opportunities for language courses, language education, and language assessment (Esfandiari & Myford, 
2013). Examples of alternative kinds of assessment include self-assessment (SA), peer-assessment, 
classroom observations by teachers, student portfolios, and interviews (Butler & Lee, 2010). Andrade, 
Du, and Mycek (2010) defined formative assessment as "a process during which students reflect on the 
quality of their work, judge the degree to which it reflects explicitly stated goals or criteria, and revise 
accordingly" (p. 3). Suzuki (2009) stated that the advantages of alternative assessment include the 
following: (a) quick administration; (b) students' involvement in the assessment process; (c) 
enhancement of students' autonomy in language learning; and (d) increase of students' motivation for 
language learning (Blanche & Merino, 1989; Brown & Hudson, 1998). 

As technology evolves and online learning opportunities expand, language assessments such as those 
provided by DIALANG are easily administered. DIALANG is a Web-based assessment system used for 
language learners who want to obtain diagnostic information about their language proficiency (Council 
of Europe, 2001). DIALANG evaluates reading, writing, listening, grammar, and vocabulary skills. 
Speaking is not evaluated through DIALANG 
(http://www.lancaster.ac.uk/researchenterprise/dialang/about.htm). 

The purpose of this study was to examine language competency through the Web-based DIALANG 
language assessment tool across different majors at one university in Iran. The participants' majors were 
Teaching English as a Foreign Language (TEFL), English Language Literature (ELL), and English Language 
Translation (ELT). This study is important because there is a clear link between language proficiency and 
academic success (Sahragard, Baharloo, & Soozandehfar, 2011). While students were in the fourth year 
of their programs at the university where this study was undertaken, they had problems with 
comprehension of English language course materials and their language proficiency was less than 
adequate. This put these students at risk for not completing their programs and graduating from the 
university. 


LITERATURE REVIEW 


Learner Self-Assessment 

Moritz (1996) considers self-assessment in foreign language education as a non-traditional form of 
assessment, and a logical component of both learner-centered pedagogies and more self-directed 
(autonomous) learning programs. Todd (2002) refers to self-assessment as an essential component in 
the learning experience of a self-directed learner. Conceptually, self-assessment is supported by 
theories of cognition, constructivism, and learner autonomy, especially those of Piaget and Vygotsky 
(Chen, 2008). Deakin-Crick et al. (2005) also suggest that self-assessment builds on students' ownership 
of their language learning processes, self-awareness, and responsibility for the language learning 
experience. They further suggest that engaging learners in the assessment of their own language 
learning is related to theories of learning, acceptance of the importance of motivation for learning, and 
the value of non-cognitive results. Similarly, Boud (1995) notes that self-assessment originated in the 
context of autonomous learning or learner independence. Brown (2004) also suggests that the principles 
of autonomy, intrinsic motivation, and cooperative learning comprise the theoretical justifications for 
self-assessment. Similarly, Alderson and McIntyre (2006) argue that implementation of self-assessment 
arises out of belief in student autonomy as an educational goal. 

A great number of advantages have been identified for self-assessment including the following: raising 
the level of awareness about the learning process (Benson, 2001; Blanche & Merino, 1989; Kato, 2009; 
Oscarson, 1989; Todd, 2002); promotion of learner autonomy (Cram, 1995; Dann, 2002; Kato, 2009; 
Oscarson, 1989, 1997; Paris & Paris, 2001); setting of realistic goals and directing personal learning 
(Abolfazli Khonbi & Sadeghi, 2012; Blanche & Merino, 1989; Butler & Lee, 2010; Oscarson, 1989); 
discernment of individual patterns of strengths and weaknesses (Blue, 1994; Esfandiari & Myford, 2013; 
Saito & Fujita, 2004); increasing learner motivation (Barbera, 2009; Paris & Paris, 2001; Sadler & Good, 
2006; Todd, 2002); increased effect on students' learning over time (Oscarson, 1989). Other benefits 
include expansion of assessment types (Oscarson, 1989); monitoring personal progress and reflection 
on what should be done (Barbera, 2009; Butler & Lee, 2010; Esfandiari & Myford, 2013; Hana Lim, 2007; 
Haris, 1997; Peden & Carroll, 2008; Sadler & Good, 2006; Sally, 2005); facilitation of democratic learning 
processes (Oscarson, 1989; Shohamy, 2001); taking responsibility for learning (Barbera, 2009; Esfandiari 
& Myford, 2013; Paris & Paris, 2001; Peden & Carroll, 2008; Sadler & Good, 2006); and promotion of 
learning (Black & Wiliam, 1998; Oscarson, 1989). 

In addition, the factors reported to influence the implementation of self-assessment are clear criteria 
(Airasian, 1997; Falchikov, 1986; Orsmond et al., 2000; Stiggins, 2001); training before the actual 
assessment (AlFallay, 2004; Chen, 2006; Wiggins, 1993); and sufficient practice (McDonald & Boud, 2003; 
Nicol & Macfarlane-Dick, 2006; Orsmond et al., 2000; Stefani, 1998; Taras, 2001). Teacher intervention 
and feedback (Orsmond et al., 2002; Stanley, 1992; Taras, 2003) and cultural and educational context 
(Oscarson, 1997) are also factors reported in the literature. 

Of the techniques for involving learners in self-assessment, objectively-marked discrete-point tests of 
linguistic knowledge, rating scales, and checklists are three traditional approaches to self-assessment of 
language ability (Brindley, 1989; North, 2000; Oscarson, 1989). Blanche (1988) also identifies techniques 
such as checklists (e.g., questionnaires/'can-do' statements), learners' diaries, learners' reports on 
real-life communication, self-ratings of certain instructional objectives, and retrospective self-assessment 
where learners report on their success or lack of success when communicating with native speakers 
outside the classroom and in other contexts. Of all these techniques, rating scales with holistic 
descriptors are the most commonly used self-assessment technique (Brindley, 1989; North, 2000). 

The relationship between self-assessment and learning and teaching contexts is another issue. Since 
self-assessment can potentially modify the power relationship between students and teachers, some 
teachers may find it a challenge to their authority (Towler & Broadfoot, 1992). Hamp-Lyons (2007) 
described two conflicting cultures of assessment: an exam culture and a learning culture. A focus in a 
learning culture is individual learners' improvement in learning, while an exam culture concentrates on 
learners' mastery of language proficiency in relation to norms or groups. Hamp-Lyons further states the 
transition from an exam culture to a learning culture is a complex process, requiring consideration of 
teachers' viewpoints in order to make the transition successful. 

Closely related to the problem of overestimation, as Esfandiari and Myford (2013) argue, is the 
question of the accuracy, validity, and reliability of self-ratings, which may not accurately reflect 
individual abilities. Chen (2008) states that "literature on student self-assessment often discusses its 
validity and reliability, or effectiveness, as an assessment tool based on agreement between self- and 
teacher scorings" (p. 253). For instance, Topping (2003) notes that (a) scores that students gave to 
themselves seemed to be higher than scores that teachers gave to them, and (b) self-assessments based 
on the students' perceptions of their levels of effort rather than their levels of achievement were 
particularly unreliable. 
Additionally, self-assessments appear to be more unreliable when students rate their own performance 
than when they assess their own learning products (Segers & Dochy, 2001). However, Bachman and 
Palmer (1989) argue that self-assessment tends to be a reliable and valid instrument for communicative 
language ability though learners at a lower level compared to the more proficient ones may find the 
self-assessment difficult. Likewise, LeBlanc and Painchaud (1985) and Pierce et al. (1993) note that self- 
assessment tends to be a valuable and reliable indicator of language proficiency. 

According to Butler and Lee (2010), the inherent subjectivity of self-assessment as a measurement tool 
has traditionally been reported as a threat to its validity. As a result, research analyzing the 
measurement aspect of self-assessment in foreign and second language education has focused on the 
validity of self-assessment (Butler & Lee, 2010). Butler and Lee further state that, "such validation 
studies have often examined the correlations between self-assessment scores and scores obtained 
through various types of external measurements such as objective tests, final grades, and teachers' 
ratings" (p. 7). In studies on validation (Blanche & Merino, 1989; Oscarson, 1997; Ross, 1998), the results 
are mixed, and a number of factors have been identified regarding the variability in self-assessment 
results which can be broadly categorized as follows: "(1) the domain or skill being assessed; (2) 
students' individual characteristics; and (3) the ways in which questions and items are formulated and 
delivered" (p. 7). 

Web-Based Language Testing 

With the expansion of distance and online learning, rapid development of IT and the Internet, and 
increasing accessibility and availability of both hardware and software, computers are now widely used 
in language education (Davies, 2003) and are an increasingly important factor of change in education 
(Alvarez & Rice, 2006). For instance, as Micea (2005) points out, e-learning can be viewed as a means of 
facilitating three significant outcomes: increased equity, improved productivity and improved 
innovation and competitiveness, and improved and consistent rates of lifelong learning. In this light, 
Warschauer and Meskill (2000) argue that, by using new technologies in the language classroom, 
teachers can better prepare students for the kinds of international cross-cultural interactions that are 
increasingly required for success in academic, vocational, and personal life. 

The use of computer technology in teaching languages has dramatically increased worldwide over the 
past decade (Chen, Belkada, & Okamoto, 2004; Hubbard & Levy, 2006; Son, 2008). As a result, a fairly 
large literature exists on the effectiveness of computer-assisted language learning on language 
development. The findings of these studies, according to Silye and Wiwczaroski (2002), suggest that 
both language learners and instructors have generally positive attitudes toward using computers in the 
language classroom. Less is known, however, about the more specific use of computers in the language 
testing area. 

The Web, as a recently emerged instrument of language assessment (Silye & Wiwczaroski, 2002), 
greatly expands the availability and accessibility of computer-based testing with all its potential 
advantages. Additionally, it will undoubtedly become a main medium of test delivery in the near future. 
Silye and Wiwczaroski further argue that a Web-based test is an assessment instrument that is written in 
the "language" of the Web, HTML. The test itself consists of one or several HTML file(s) located on the 
tester's computer and the server. It can be downloaded to the test taker's or client's computer. 
Downloading can occur for the entire test at once, or item by item. The client's computer makes use of 
Web-browser software, such as Google Chrome or Microsoft Internet Explorer, to interpret and display 
the downloaded HTML data. Test takers respond to items on their computers and may send their 
responses back to the server. Alternately, their responses can be scored in real time by means of a 
scoring script administered by the person providing oversight for the test. A script can then be 
generated to provide immediate feedback, adaptation of item selection to the test taker's needs, and/or 
computation of a score to be displayed after completion of the test. The same evaluation process can 
take place on the server by means of server-side programs. 
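
The scoring step described above can be illustrated with a minimal sketch. It is not taken from Silye and Wiwczaroski (2002) or from DIALANG; the function name `score_test`, the item identifiers, and the answer key are hypothetical, and the same logic could in principle run either in the browser or on the server.

```python
def score_test(answer_key, responses):
    """Score a set of responses against an answer key.

    answer_key: dict mapping item id -> correct option (hypothetical data).
    responses:  dict mapping item id -> option chosen by the test taker.
    Returns the total score and per-item feedback (True = correct).
    """
    feedback = {}
    for item_id, correct_option in answer_key.items():
        # An unanswered item is treated as incorrect.
        feedback[item_id] = responses.get(item_id) == correct_option
    score = sum(feedback.values())
    return score, feedback


# Hypothetical three-item test with one unanswered item.
key = {"q1": "b", "q2": "a", "q3": "d"}
answers = {"q1": "b", "q2": "c"}
total, per_item = score_test(key, answers)
print(f"Score: {total}/{len(key)}")
for item_id, is_correct in per_item.items():
    print(item_id, "correct" if is_correct else "incorrect")
```

Returning per-item feedback alongside the total reflects the immediate-feedback use mentioned in the text; adaptive item selection would require additional logic that is not sketched here.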

Web-based tests offer various advantages. Roever (2001) 
suggests three main ones: (a) flexibility in space and time, which is probably the biggest 
advantage of a Web-based test, since all that is required to take a Web-based test is a computer with a 
Web browser and an Internet connection; in addition, test takers can take the test wherever and 
whenever it is convenient, and test developers can share their test with colleagues all over the world 
and receive feedback; (b) Web-based tests are comparatively easy to write and require only a free, 
standard browser for their display; and (c) a Web-based test is very inexpensive for all parties 
concerned, including testers and test takers. Alvarez and Rice (2006) argue that Web-based tests 
provide immediate feedback, improved security, ways to store test results for further analysis, storage 
of large amounts of items, multimedia presentations, grading objectivity, and self-pacing for the test 
taker. 

Despite the numerous advantages of Web-based tests, some limitations are suggested. For instance, 
Silye and Wiwczaroski (2002) contend that the greatest limitation of these types of tests is their lack of 
security with regard to item confidentiality and cheating. Alvarez and Rice (2006) suggest security 
and/or technical problems such as browser incompatibilities and server failure as other drawbacks in the 
use of Web-based tests. They further note that computers lack human intelligence to assess direct 
speaking ability and freely written compositions. 

DIALANG as a Web-Based Assessment Tool 

DIALANG, as Alderson (2005) states, is a computer-based learner-centered diagnostic language 
assessment system, which is intended for language learners who want to obtain diagnostic information 
about their proficiency. Like other Web-based tests, DIALANG offers many advantages such as flexibility 
in space and time, ease of writing, and affordability (Alvarez & Rice, 2006). Brantmeier and 
Vanderplank (2008) assert that for the DIALANG project, Alderson specifically excludes self-assessment 
from high stakes testing and regards self-assessment more as a valuable descriptive and explanatory 
tool for providing feedback to learners. 

DIALANG is a large and complex project that is funded by the European Union (Escribano & McMahon, 
2010) and developed by a team of experts at the University of Lancaster (Klimova & Hubackova, 2013). 
According to Chapelle (2006), theoretical and empirical rationales have been taken into account in the 
design and development of DIALANG. As Alderson (2005) states, DIALANG contains tests of five 
language skills or aspects of language knowledge (i.e., Listening, Reading, Writing, Grammar, and 
Vocabulary); because of the constraints of computer-based testing, the CEFR scales for spoken 
production were ignored. 

DIALANG, according to Alderson (2005), is unique in that it attempts the diagnostic assessment of 14 
European languages: Danish, Dutch, English, Finnish, French, German, Greek, Italian, Portuguese, 
Spanish, Swedish, Irish, Icelandic, and Norwegian. DIALANG's Assessment Framework and the 
descriptive scales used for reporting the results to the users are directly based on the CEFR (Council of 
Europe, 2001). The DIALANG Framework, as Haahr and Hansen (2006) assert, summarizes "the relevant 
content of the CEFR, including the six-point reference scale, communicative tasks and purposes, themes 
and specific notions, activities, texts and functions" (p. 78). They further state that DIALANG provides 
learners with various kinds of feedback on the weak and strong points in their language proficiency and 
constructive advice for further learning. Diagnosis offered in DIALANG is not concerned with specific 
language curricula or courses; rather, it is based on the specifications of language proficiency suggested 
in the CEFR. 

Suggested Citation: Taghizadeh, M., Alavi, S., & Rezaee, A. (2014). Diagnosing L2 learners' language skills based on the use of a 
web-based assessment tool called DIALANG. International Journal of E-Learning & Distance Education, 29(2), 1-28. Available 
online at: http://ijede.ca/index.php/jde/article/view/889/1564 

According to Haahr and Hansen (2006), the main methodological steps and challenges in DIALANG's 
development of a standard based on the CEFR are (1) awareness of the theoretical assumptions of the 
framework; (2) compensating for the limitations of the framework; (3) explaining the framework to the 
assessment development teams; (4) choosing between conservative and innovative items (i.e., do the 
test items represent a variety of different formats, and are the possibilities of computer-based tests 
sufficiently utilized?); (5) piloting of items; and (6) relating test items and resulting scores to the CEFR 
levels (p. 78). 

DIALANG serves adults who want to know about their levels of language proficiency and who want to 
receive feedback on the weaknesses and strengths of their proficiency (Council of Europe, 2001). 
According to Alvarez and Rice (2006), the main advantage of DIALANG is perhaps the way in which 
this project provides assessment information: (a) scores are objective, (b) the feedback is immediate, (c) 
instead of a global score, there is identification of the test taker's current language level, (d) it enables 
the planning of curricula, (e) it offers learners further study opportunities, (f) it meets the learner's 
need for feedback, and (g) it allows for storage of outcomes for later comparisons in order to check 
progress. Alderson (2005) suggests that the value of the DIALANG Framework is in the following areas: 

• It has been possible to gather information from a fairly large number of candidates, at least for 
English, from a wide range of national and linguistic as well as educational backgrounds (p. 29). 

• The DIALANG system aims to encourage self-diagnosis and self-help and is entirely 
learner-centered (p. 31). 

• For the learner or student, the DIALANG system is a free of charge, no stakes test (p. 31). 

By contrast, in Brunfaut (2014), Alderson suggests two areas in which DIALANG has not been 
successful: (a) in some of the languages involved in the program, items are limited, and (b) the theory 
behind this test is fairly traditional. Alvarez and Rice also argue that a real challenge for Web-based tests 
is assessment of productive skills (i.e., speaking and writing). With respect to DIALANG, they state that, 
at the moment, DIALANG lacks a system to assess writing and listening in terms of full sentences, 
paragraphs, and essays. They further note that a project to consider these deficiencies is currently taking 
place, and they present some examples of the items produced in the experimental phase of this project. 

The system can be downloaded from the DIALANG Website at www.dialang.org. Users have to be 
connected to the Internet to take the tests because test items and interface texts come from the 
DIALANG servers and are cached on the individual user's machine (Alderson & Huhta, 2005). Alderson 
and Huhta (2005) and Alvarez and Rice (2006) describe the procedure for taking the DIALANG test, 
which does not require extensive computer skills, as follows. The first 
thing to do is to choose the language in which instructions will be given. After that, the language and 
skill to be tested are selected. Then, test takers have the option to take a vocabulary placement test. 

The system uses the vocabulary test score to select a test of suitable difficulty for the test taker. The 
next step is an optional battery of self-assessment statements about listening, reading, and writing skills 
before proceeding to the test itself. If users have completed both the vocabulary placement test and the 
self-assessment, the system combines the two results to decide which level of test to administer. Next, 
learners take the test for the skill chosen previously. The items are frequently multiple choice questions 
but sometimes a word or phrase has to be written. Finally, DIALANG presents the results in terms of the 
CEFR level, answer verification, score in the placement test, self-assessment feedback, and advice. 

Self-assessment, as Alderson (2005) notes, is the central component of DIALANG, and the self- 
assessment statements are taken directly from the CEFR; specifically, they include a large number of 
'can do' statements for each skill and at each level (Haahr & Hansen, 2006). However, the wording of 
CEFR statements was changed from 'can do' to 'I can' while some statements were also simplified for 
use with certain audiences. DIALANG also developed a number of 'can do' statements for grammar and 
vocabulary (Haahr & Hansen, 2006). 

As Haahr and Hansen (2006) point out, the self-assessment statements in the DIALANG Framework 
underwent a piloting procedure similar to the test items and the correlations between their calibrated 
values of difficulty and the original CEFR levels were very high (0.91-0.93). This result indicates that the 
DIALANG self-assessment statements correspond closely to the original CEFR levels. It was also revealed 
that the self-assessment statements were equivalent across different languages (Alderson, 2005). 

Research Aims, Purpose and Questions 

While research has been conducted on self-assessment in language learning (Alderson & McIntyre, 
2006; Chen, 2008; Deakin-Crick et al., 2005; Escribano & McMahon, 2010; Hana Lim, 2007; Kato, 2009; 
Suzuki, 2009; Wagner & Lilly, 1999), very little empirical examination has been undertaken using 
DIALANG to assess language proficiency, particularly in the Iranian context. The purpose of this study 
was, therefore, to examine second language (L2) learners' self-assessed level in listening, reading, and 
writing skills in terms of SA statements of the DIALANG project: 

1. Is there any statistically significant variability in the L2 learners' performance in the listening 
section of the DIALANG scale? 

2. Is there any statistically significant variability in the L2 learners' performance in the reading 
section of the DIALANG scale? 

3. Is there any statistically significant variability in the L2 learners' performance in the writing 
section of the DIALANG scale? 


METHODS 


Participants 

A convenience sample of 68 Iranian Bachelor of Arts students studying at an institute of higher learning 
in Iran participated in the study. The majors of the participants were Teaching English as a Foreign 
Language (TEFL; n = 23), English Language Literature (ELL; n = 22), and English Language Translation (ELT; 
n = 23). Participants were male (27%) and female (73%) and ranged in age from 18 to 27. Participants 
were all in their fourth year of study. Approval to conduct the study was obtained at the university prior 
to data collection. 

Instruments 

DIALANG scales of self-assessment were used in this study. They consisted of 107 statements with "yes" 
or "no" responses. The underlying constructs of the scales included reading skill (31 self-assessment 
statements), listening skill (43 self-assessment statements), and writing skill (33 self-assessment 
statements). Based on the Common European Framework of Reference for Languages (CEFRL) 
(https://www.eui.eu/Documents/ServicesAdmin/LanguageCentre/CEF.pdf), the levels are understood as 
follows: at the A level, the learner is considered to be at the basic level of proficiency; at the B level, the 
learner is considered to be at the independent user level; at the C level, the learner is considered to be 
at a proficient level. More information regarding the CEFRL can be found in Appendix A. Detailed 
information about the different levels of each DIALANG construct is presented in Table 1. 

Cronbach's alpha was used to estimate the consistency of the participants' responses to the DIALANG 
scales. Reliability coefficients for the listening, reading, and writing scales were 0.90, 0.88, and 0.88, 
respectively, indicating that the scales demonstrated high reliability. 
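
As an illustration of the reliability estimate, the following is a minimal sketch of Cronbach's alpha for dichotomous (yes = 1, no = 0) responses, using the standard formula alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The function name and the small response matrix are hypothetical and are not the study's data; the authors do not report which software they used.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Estimate Cronbach's alpha.

    responses: list of rows, one per respondent; each row is a list of
    item scores (here 1 = yes, 0 = no). Hypothetical illustration only.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Four hypothetical respondents answering three yes/no statements.
sample = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(cronbach_alpha(sample), 2))  # 0.75
```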

Table 1. The Number of Self-Assessment Statements for the Levels of the DIALANG Scales 

                         Levels
Skills       A1    A2    B1    B2    C1    C2    Total
Listening     4    10    10     9     9     1       43
Reading       5     9     8     6     2     1       31
Writing       6     7     7     4     5     4       33


Procedures 

The study was conducted at the end of the Fall Semester in 2013. The researchers provided the 
participants with some background about the DIALANG self-assessment experience and explained the 
goal of the study in order to encourage their participation. Then, the Persian version of the DIALANG 
self-assessment scales was administered to the students. They were asked to complete the self- 
assessment in 100 minutes. 

Statistical Analysis 

To answer the research questions, various statistical analyses were performed. Descriptive statistics and 
chi-square analysis for all self-assessment statements of the DIALANG scales were performed. In 
addition, descriptive statistics for listening, reading, and writing skills and participants' levels were 
calculated for the TEFL, ELL, and ELT students. Further, to test for statistically significant 
differences in the participants' responses to the three parts of the DIALANG scales, the Kruskal-Wallis 
test and one-way ANOVA were performed. 
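
The two group comparisons named above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names and sample scores are hypothetical, the Kruskal-Wallis p-value uses the chi-square approximation (which has a closed form, exp(-H/2), only for three groups, i.e., df = 2), and the ANOVA function returns the F statistic without a p-value.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_three_groups(a, b, c):
    """Tie-corrected H statistic and approximate p-value for three groups."""
    groups = [a, b, c]
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums = 0.0
    offset = 0
    for group in groups:
        rank_sum = sum(ranks[offset:offset + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        offset += len(group)
    h = 12 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)
    # Correction for tied values.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all values are identical")
    h /= correction
    # Chi-square survival function with df = 2 (three groups).
    return h, math.exp(-h / 2)


def one_way_anova_f(groups):
    """F statistic for a one-way between-groups ANOVA."""
    pooled = [x for group in groups for x in group]
    grand_mean = sum(pooled) / len(pooled)
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(pooled) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical scores for three groups (not the study's data).
tefl, ell, elt = [1, 2, 3], [4, 5, 6], [7, 8, 9]
h_stat, p_value = kruskal_wallis_three_groups(tefl, ell, elt)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
print(f"F = {one_way_anova_f([tefl, ell, elt]):.2f}")
```

The rank-based Kruskal-Wallis test does not assume normally distributed scores, whereas ANOVA does; the chi-square approximation for H is rough for very small groups such as those in this toy example.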


RESULTS 


Statements on DIALANG Scales 

In order to determine which items received more positive replies, the frequency and percentage of the 
participants' responses for each item in the listening, reading, and writing skills scales were calculated. 
With regard to listening skill, Item 3, 'Understanding questions and instructions and following short, 
simple directions', generated the highest frequency (f = 65), while Item 42, 'Following films which contain 
a considerable degree of slang and idiomatic usage', generated the lowest frequency (f = 32). 

Considering the chi-square and p-values of the listening scale statements, the frequencies of 
the participants' yes/no responses were significantly different in most of the items except the following: 
Item 30, 'Understanding announcements and messages on concrete and abstract topics spoken in 
standard language at normal speed' (χ² = 2.12, p = .146); Item 37, 'Following extended speech even when it is 
not clearly structured and when relationships between ideas are only implied and not stated explicitly' 
(χ² = 0.00, p = 1.000); Item 38, 'Following most lectures, discussions and debates with relative ease' 
(χ² = 0.00, p = 1.000); Item 39, 'Extracting specific information from poor quality public announcements' 
(χ² = 2.12, p = .146); Item 41, 'Understanding a wide range of recorded audio material and identifying finer 
points of detail' (χ² = 2.12, p = .146); Item 42, 'Following films which contain a considerable degree of 
slang and idiomatic usage' (χ² = 0.24, p = .628); and Item 43, 'Following specialized lectures and 
presentations which use a high degree of colloquialism, regional usage or unfamiliar terminology' 
(χ² = 0.06, p = .808). For these items, the frequencies of participants' yes/no responses were not 
statistically different. This finding indicates that, statistically, the participants held the same viewpoints 
regarding these statements. 
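
The reported item-level statistics can be reproduced with a short sketch, assuming that each test compared the observed yes/no counts of the 68 participants with an even split (34 expected in each category); the paper does not state the expected frequencies explicitly, so this is an inference. The function name is hypothetical. With one degree of freedom, the p-value has the closed form erfc(sqrt(χ²/2)).

```python
import math


def chi_square_yes_no(yes, no):
    """Chi-square goodness-of-fit test of yes/no counts against an even split.

    Assumes equal expected frequencies (df = 1); this assumption is not
    stated in the paper.
    """
    total = yes + no
    if total <= 0:
        raise ValueError("need at least one response")
    expected = total / 2
    chi2 = ((yes - expected) ** 2 + (no - expected) ** 2) / expected
    # Survival function of the chi-square distribution with df = 1.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Listening Item 42: 32 'yes' responses out of 68, hence 36 'no' responses.
chi2, p = chi_square_yes_no(32, 36)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # chi2 = 0.24, p = 0.628
```

Under the same assumption, a 34/34 split gives χ² = 0.00 (p = 1.000) and a 40/28 split gives χ² = 2.12 (p = .146), which matches the values reported for the other non-significant items.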

Related to reading skill, two statements at the A1 level (Item 2, 'Understanding very short, simple texts, 
putting together familiar names, words, and basic phrases', and Item 3, 'Following short, simple written 
instructions, especially if they contain pictures') and two statements at the A2 level (Item 7, 'Understanding 
short, simple texts written in common everyday language', and Item 11, 'Understanding short, simple 
personal letters') received the highest frequency (f = 64), whereas the only statement at the C2 level, 
'Understanding and interpreting practically all forms of written language', received the lowest frequency 
(f = 34). In addition, the frequencies of participants' yes/no responses were significantly 
different in 28 statements; the exceptions were Item 27, 'Identifying the content and relevance of news items, 
articles and reports on a wide range of professional topics' (χ² = 3.76, p = .052), Item 30, 'Understanding 
long, complex instructions on a new machine or procedure even outside area of specialty' (χ² = 3.76, 
p = .052), and Item 31, 'Understanding and interpreting practically all forms of written language' 
(χ² = 0.00, p = 1.000). 

With regard to writing statements, 'Writing simple notes to friends' and 'Writing very simple personal 
letters expressing thanks and apology' received the highest frequency (f = 65). Item 31, 'Producing 
clear, smoothly flowing, complex reports, articles or essays', received the lowest frequency. The 
frequencies of the participants' yes/no responses were significantly different for most of the 
items. The exceptions were Item 23, 'Constructing a chain of reasoned argument' (χ² = 0.53, p = .467); Item 24, 
'Speculating about causes, consequences and hypothetical situations' (χ² = 0.94, p = .332); Item 25, 
'Expressing and supporting points of view at some length with subsidiary points, reasons and relevant 
examples' (χ² = 0.53, p = .467); Item 26, 'Developing an argument systematically, giving appropriate 
emphasis to significant points, and presenting relevant supporting details' (χ² = 0.53, p = .467); Item 27, 
'Giving clear detailed descriptions of complex subjects' (χ² = 0.00, p = 1.000); Item 30, 'Providing an 
appropriate and effective logical structure' (χ² = 2.12, p = .146); Item 31, 'Producing clear, smoothly 
flowing, complex reports, articles or essays' (χ² = 0.94, p = .332); Item 32, 'Writing so well that native 
speakers need not check my texts' (χ² = 0.53, p = .467); and Item 33, 'Writing so well that my texts 
cannot be improved significantly even by teachers of writing' (χ² = 0.53, p = .467). 

Comparison of Scales Across the TEFL, ELL, and ELT Majors 

Descriptive statistics for the listening scale levels for the three majors are reported in Table 2. 

Table 2. Means (M) and Standard Deviations (SD) for Listening Scale Levels Based on Major 

                         Major
             TEFL            ELL             ELT
Levels     M      SD       M      SD       M      SD
A1        3.65   0.77     3.86   0.35     3.83   0.38
A2        9.17   1.72     9.09   1.19     8.78   2.06
B1        8.87   1.74     9.18   1.81     8.09   2.71
B2        6.13   2.13     6.73   2.88     5.61   3.05
C1        4.13   2.89     6.91   2.74     5.48   2.84
C2        0.35   0.48     0.50   0.51     0.70   0.47


Students from different majors did not report equal capabilities on the 'can do' statements of the 
listening scale. The highest capabilities (M = 9.18) were reported by the ELL students for the B1 
statements, while the lowest (M = 0.35) were reported by the TEFL students for the C2 
statements. The ELL students' responses to the A1 statements were the most homogeneous (SD 
= 0.35), while the ELT students' responses to the B2 statements were the most heterogeneous (SD 
= 3.05). 

Further, ELL students reported the highest abilities on the A1 'can do' statements (M = 3.86), 
while TEFL students reported the lowest (M = 3.65). With regard to the A2 statements, TEFL 
students gained the highest mean score (M = 9.17), whereas ELT students received the lowest mean 
score (M = 8.78). For the B1 statements, the mean score for the ELL students was the highest (M = 
9.18), whereas the mean score for the ELT students was the lowest (M = 8.09). Similarly, ELL students 
reported the highest abilities (M = 6.73) on the B2 statements, while ELT students reported 
the lowest (M = 5.61). For the C1 statements, ELL students reported the highest mean score (M = 6.91), 
and TEFL students reported the lowest (M = 4.13). Concerning the C2 statements, ELT students 
received the highest mean score (M = 0.70), while TEFL students received the lowest (M = 0.35). 
Table 3 shows that learners had different abilities with respect to the six levels for the reading scale. 

Table 3. Means (M) and Standard Deviations (SD) for Reading Scale Levels Based on Major 

                         Major
             TEFL            Literature      Translation
Levels     M      SD       M      SD       M      SD
A1        4.61   1.07     4.73   0.63     4.52   0.94
A2        7.87   1.93     8.00   1.71     7.09   2.59
B1        6.96   1.69     7.09   2.97     5.87   2.43
B2        4.04   1.63     4.36   2.03     3.83   2.14
C1        1.35   0.77     1.50   0.67     1.43   0.72
C2        0.39   0.49     0.55   0.51     0.57   0.50


ELL students gained the highest mean score (M = 8.00) on the A2 statements, while TEFL 
students received the lowest mean score (M = 0.39) on the C2 statements. The most homogeneous 
responses (SD = 0.49) occurred with the C2 statements as reported by TEFL students, whereas 
the most heterogeneous ones (SD = 2.97) came from the ELL students at the B1 level. 

At the A1 level, the mean score for the ELL students was the highest (M = 4.73), while the mean score 
for the ELT students was the lowest (M = 4.52). The same trend appeared for the statements of the A2, 
B1, and B2 levels. In other words, ELL students reported the highest abilities at these levels, while ELT 
students reported the lowest: A2: M(ELL) = 8.00, M(ELT) = 7.09; B1: M(ELL) = 7.09, M(ELT) = 5.87; 
B2: M(ELL) = 4.36, M(ELT) = 3.83. At the C1 level, however, ELL students reported being the most 
capable (M = 1.50), while TEFL students reported the lowest abilities (M = 1.35). For statements at the 
C2 level, ELT students gained the highest mean score (M = 0.57), while TEFL students received the 
lowest one (M = 0.39). Table 4 shows that learners had different abilities with respect to the six levels 
for the writing scale. 

Table 4. Means (M) and Standard Deviations (SD) for Writing Scale Levels Based on Major 

                         Major
             TEFL            Literature      Translation
Levels     M      SD       M      SD       M      SD
A1        5.52   0.94     5.55   0.96     5.74   2.66
A2        6.39   1.03     5.91   1.68     5.48   1.88
B1        5.83   1.37     5.82   1.86     4.91   2.39
B2        2.17   1.07     2.77   1.34     2.43   1.75
C1        2.39   1.40     3.14   1.80     3.22   1.97
C2        1.30   1.36     2.36   1.64     2.17   1.64


TEFL students rated their writing abilities at the A2 level the highest (M = 6.39), while their self-ratings of 
the C2 level statements received the lowest mean score (M = 1.30). The most homogeneous responses 
(SD = 0.94) were the TEFL students' responses at the A1 level, while the most heterogeneous 
ones (SD = 2.66) were reported by ELT students at the A1 level. Overall, the participants of this study 
assessed their writing ability in the following order, based on the mean scores in Table 4: TEFL students 
(A2, B1, A1, C1, B2, C2); ELL students (A2, B1, A1, C1, B2, C2); ELT students (A1, A2, B1, C1, B2, C2). 

Participants' Total Score on the Three Scales of DIALANG 

Before investigating whether there were any statistically significant differences in the listening and 
reading skills scales for the three groups, tests of normality were conducted. Results showed that the 
listening and reading scores violated the assumption of normality (p < .05). Therefore, in order to 
compare the participants relative to these two scales, a Kruskal-Wallis (K-W) test was conducted. Results 
are presented in Tables 5 and 6. 

Table 5. K-W Test for the Listening Scale Statements 

Skill       Major         N    Mean Rank   Chi-Square   df   p
Listening   TEFL          23   28.63       5.366        2    .068
            Literature    22   42.05
            Translation   23   33.15


ELL students had the highest overall ranking (Mean Rank = 42.05), whereas TEFL students received the 
lowest overall ranking (Mean Rank = 28.63). In addition, the K-W test did not reveal a statistically 
significant difference in the listening statements across the three majors (TEFL group, n = 23, ELL group, 
n = 22, ELT group, n = 23), χ² (2, n = 68) = 5.366, p = .068. 

Table 6. K-W Test for the Reading Scale Statements 

Skill       Major         N    Mean Rank   Chi-Square   df   p
Reading     TEFL          23   34.48       1.187        2    .552
            Literature    22   37.77
            Translation   23   31.39





Results indicated that ELL students had the highest scores (Mean Rank = 37.77), with the ELT students 
reporting the lowest scores (Mean Rank = 31.39). The results of the K-W test showed that there was no 
statistically significant difference in the reading statements across the three majors (TEFL group, n = 23, 
ELL group, n = 22, ELT group, n = 23), χ² (2, n = 68) = 1.187, p = .552. 
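
As a rough consistency check on Tables 5 and 6, the Kruskal-Wallis statistic can be approximated from the group sizes and mean ranks alone. The sketch below uses the textbook formula H = 12 / (N(N + 1)) × Σ nᵢ(R̄ᵢ − (N + 1)/2)² without a tie correction. Because the raw responses are not available here, the result is only an approximation of what statistical software would report from the full data. The function names are hypothetical, and the p-value helper is valid only for df = 2 (three groups).

```python
import math


def kruskal_wallis_from_mean_ranks(groups):
    """Uncorrected H statistic from (n, mean_rank) pairs.

    No tie correction is applied, so H may be slightly lower than the
    value a statistics package reports from the raw data.
    """
    total_n = sum(n for n, _ in groups)
    grand_mean_rank = (total_n + 1) / 2
    weighted_sq = sum(n * (rank - grand_mean_rank) ** 2 for n, rank in groups)
    return 12 / (total_n * (total_n + 1)) * weighted_sq


def chi_square_sf_df2(x):
    """Survival function of the chi-square distribution with df = 2."""
    return math.exp(-x / 2)


# (n, mean rank) for TEFL, ELL, ELT, taken from Tables 5 and 6.
listening = [(23, 28.63), (22, 42.05), (23, 33.15)]
reading = [(23, 34.48), (22, 37.77), (23, 31.39)]

for name, groups in (("listening", listening), ("reading", reading)):
    h = kruskal_wallis_from_mean_ranks(groups)
    print(f"{name}: H = {h:.3f}, p = {chi_square_sf_df2(h):.3f}")
```

The uncorrected values come out at roughly H ≈ 5.34 for listening and H ≈ 1.17 for reading, slightly below the reported 5.366 and 1.187; a small upward shift of this size is what a tie correction would produce, although that explanation is an inference rather than something stated in the tables. Applying `chi_square_sf_df2` to the reported statistics gives p ≈ .068 and p ≈ .552, matching the tables.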

Normality was also assessed for the writing scale. Results indicated that the writing scores of 
all majors did not violate the assumption of normality (p > .05). Therefore, in order to compare the 
participants' viewpoints with regard to the writing statements, a one-way ANOVA was conducted. 
Descriptive statistics are presented in Table 7, and the ANOVA results in Table 8. 

Table 7. Means and Standard Deviations of Students' Self-Assessment of their Writing Ability 

             TEFL            Literature      Translation
           M      SD       M      SD       M      SD
Writing   23.61  3.76     25.55  6.80     23.96  9.51


ELL students reported the highest abilities on the writing statements (M = 25.55), whereas TEFL students 
reported the lowest abilities (M = 23.61) in this regard. Table 7 also shows that the most homogeneous 
responses were generated by the TEFL students, while the most heterogeneous ones were from ELT 
students. As described in Table 8, a one-way between-groups ANOVA did not show a statistically 
significant difference in the writing self-assessment statements for the three groups. 

Table 8. Comparing the Participants' Scores on the Writing Scale 

Writing          Sum of Squares   df   Mean Square   F      p
Between Groups   47.640           2    23.820        .473   .625
Within Groups    3275.889         65   50.398
Total            3323.529         67
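
The ANOVA table can be cross-checked against the group means and standard deviations in Table 7. The following sketch rebuilds the sums of squares from summary statistics only, so small discrepancies from rounding of the published means and SDs are expected. The function names are hypothetical, and the closed-form p-value applies only when the numerator df equals 2, as is the case with three groups.

```python
def one_way_anova_from_summary(groups):
    """One-way ANOVA quantities from (n, mean, sd) triples, one per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-groups SS: weighted squared deviations of group means.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups SS: pooled from each group's sample variance.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


def f_sf_numerator_df2(f_stat, df_within):
    """P(F > f) for an F distribution whose numerator df is exactly 2."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


# (n, M, SD) for TEFL, ELL, ELT, taken from Table 7.
writing = [(23, 23.61, 3.76), (22, 25.55, 6.80), (23, 23.96, 9.51)]
ss_b, ss_w, df_b, df_w, f_stat = one_way_anova_from_summary(writing)
print(f"SS_between = {ss_b:.2f}, SS_within = {ss_w:.2f}")
print(f"F({df_b}, {df_w}) = {f_stat:.3f}, p = {f_sf_numerator_df2(f_stat, df_w):.3f}")
```

The reconstructed values (SS between ≈ 47.8, SS within ≈ 3271.7, F ≈ 0.47) are close to those in Table 8, and the reported F = .473 with 65 within-groups degrees of freedom corresponds to p ≈ .625, consistent with the non-significant result.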




DISCUSSION 

In spite of six years of compulsory English education at the secondary level, and at least three years of 
English language courses at the post-secondary level, the undergraduate students in this study did not 
report high language skill ability, and their scores for listening, reading and writing were quite low. The 
researchers believe that these results may be related to a few factors. First, pre-university 
language courses generally focus on reading, and no systematic instruction is offered to master 
listening, writing, or speaking skills. Large class sizes may be another reason for the low scores, because 
students may not receive the individualized instruction and attention they need to develop language 
skills. 

In addition, traditional teacher-centered teaching methods, a lack of standards for language proficiency, 
and a lack of predetermined, concrete learning outcomes may account for the problems with English 
language instruction at this university. Moreover, student assessment is mostly summative, with 
learning outcomes reported as a single score. As a result, students are rarely asked to evaluate 
their abilities or to take part in self-assessment during their language learning process. 

Technology and online learning may provide additional learning opportunities for students 
at this university; however, both are currently under-utilized. In the absence of technology and Web- 
based learning opportunities, an alternative means for assessing language proficiency and identifying 
deficiencies early, such as that provided by DIALANG, may be warranted. The researchers believe 
that DIALANG self-assessment may help students in this university take control of their language 
learning because of its many features. Features offered by DIALANG include learner feedback and 
identification of language deficiencies, which can help learners diagnose their weaknesses and strengths 
and increase their awareness of their language levels. 

Study Limitations 

The small sample size limits the generalizability of the findings beyond the post-secondary institution 
where the study was conducted. Further, given the researchers' relationships with participants as colleagues 
and administrators, it is not known whether responses were guarded or offered in order to be socially 
desirable. Participant availability was another limitation of this study. Additionally, factors such as 
sociocultural background, L2 proficiency level, gender, and age were not taken into account; these might 
have provided additional insights into why particular scores were obtained. 

Conclusion 

This study aimed to investigate L2 learners' levels in the listening, reading, and writing skills in terms of 
the 'can do' statements of the DIALANG Framework. To this end, undergraduate students of the English 
language were asked to rate their abilities based on the self-assessment statements of DIALANG. The 
results of the study revealed that the participants rated their language abilities in the 
following order: TEFL students (reading, listening, writing); ELL students (listening, reading, writing); and 
ELT students (listening, reading, writing). Although ELL students gained the highest mean score for all 
three skills, there was no statistically significant difference in the reading, writing, and listening 
statements across the three majors. 

Language teachers in both online and face-to-face classes should promote learners' ability to self-assess 
and to reflect on their language proficiency level, in order to facilitate the process of language 
learning and to help learners formulate and analyze the steps needed to achieve their learning goals. It is 
suggested that teachers implement self-assessment, introduce DIALANG statements as part of language 
instruction, and train students to conduct self-assessment based on the 'can do' statements. Students in both 
language learning environments can also evaluate their progress in the language skills based on the 'can 
do' statements and then formulate specific goals for their future progress. They can examine how 
well they have attained the learning goals for the course based on the 'can do' statements. To improve 
learners' language skills, materials developers can also consider the language proficiency standards offered 
in the DIALANG Framework when developing teaching materials and textbooks, as it is believed that DIALANG 
provides learners with concrete learning outcomes, which are an essential element of course 
design. 

It is suggested that research be conducted comparing the language proficiency levels of online 
students and those who study in traditional language classes. Based on the results of the 
DIALANG test, a comparison can then be made in terms of students' strengths and weaknesses in these two 
educational environments. In future studies, learners in both face-to-face and online language classes can 
receive training based on the 'can do' statements and then be put in charge of rating their own 
performance. The impact of this training on the language proficiency of these learners can then be 
compared. The relationship between self-assessment in terms of DIALANG statements and factors such 
as personality traits, learning anxiety, locus of control, and cognitive style merits further inquiry. 
Future researchers can also use more qualitative methods, such as in-depth interviews with learners and 
tutors about L2 performance with respect to the DIALANG test and its 'can do' statements. 

APPENDIX A 

Proficient User 

C2   Can understand with ease virtually everything heard or read. Can summarize 
     information from different spoken and written sources, reconstructing 
     arguments and accounts in a coherent presentation. Can express him/herself 
     spontaneously, very fluently and precisely, differentiating finer shades of 
     meaning even in more complex situations. 

C1   Can understand a wide range of demanding, longer texts, and recognize 
     implicit meaning. Can express him/herself fluently and spontaneously 
     without much obvious searching for expressions. Can use language flexibly and 
     effectively for social, academic and professional purposes. Can produce clear, 
     well-structured, detailed text on complex subjects, showing controlled use of 
     organizational patterns, connectors and cohesive devices. 

Independent User 

B2   Can understand the main ideas of complex text on both concrete and 
     abstract topics, including technical discussions in his/her field of 
     specialization. Can interact with a degree of fluency and spontaneity that 
     makes regular interaction with native speakers quite possible without strain 
     for either party. Can produce clear, detailed text on a wide range of subjects 
     and explain a viewpoint on a topical issue giving the advantages and 
     disadvantages of various options. 

B1   Can understand the main points of clear standard input on familiar matters 
     regularly encountered in work, school, leisure, etc. Can deal with most 
     situations likely to arise whilst travelling in an area where the language is 
     spoken. Can produce simple connected text on topics which are familiar or of 
     personal interest. Can describe experiences and events, dreams, hopes and 
     ambitions and briefly give reasons and explanations for opinions and plans. 

Basic User 

A2   Can understand sentences and frequently used expressions related to areas 
     of most immediate relevance (e.g. very basic personal and family information, 
     shopping, local geography, employment). Can communicate in simple and 
     routine tasks requiring a simple and direct exchange of information on 
     familiar and routine matters. Can describe in simple terms aspects of his/her 
     background, immediate environment and matters in areas of immediate 
     need. 

A1   Can understand and use familiar everyday expressions and very basic phrases 
     aimed at the satisfaction of needs of a concrete type. Can introduce 
     him/herself and others and can ask and answer questions about personal 
     details such as where he/she lives, people he/she knows and things he/she has. 
     Can interact in a simple way provided the other person talks slowly and clearly 
     and is prepared to help. 


Obtained on 11-18-14 from: https://www.eui.eu/Documents/ServicesAdmin/LanguageCentre/CEF.pdf 